Abstract

Most  test  data  adequacy  criteria  based  upon  path  selection  have  the  unfortunate  property  that 
for  some  programs  with  unexecutable  paths,  no  set  of  test  data  is  adequate.  In  this  paper  we  define 
a  new  family  of  adequacy  criteria,  derived  from  the  data  flow  testing  criteria,  which  circumvent  this 
problem  by  only  requiring  the  test  data  to  exercise  those  definition-use  associations  which  are 
executable.    The  inclusion  relationship  among  these  criteria  is  explored. 

1.    Introduction 

Several  software  test  data  adequacy  criteria  are  based  on  the  idea  that  one  cannot  consider  a 
program  to  be  adequately  tested  if  no  test  data  has  caused  certain  sequences  of  statements  to  be 
executed. These methods generally associate a subset T of the input domain of a program P with
the set 𝒫 of paths through P's flow graph which are executed when the program is run with inputs
from T. The test T, or equivalently the set of paths 𝒫, is said to satisfy criterion C for program P
("T is C-adequate for P") if and only if each of the sequences required by C is a subpath of one of
the paths in 𝒫.

The  most  well-known  of  these  criteria  are  statement  testing,  branch  testing  and  path  testing 
which  require  that  the  test  data  cause  every  node  (respectively  branch,  path)  in  the  program's  flow 
graph  to  be  executed  [HOW78,HUA75].  Unfortunately,  statement  and  branch  testing  can  fail  to 
expose  many  common  errors  and  path  testing  is  usually  infeasible  since  programs  with  loops  may 
have infinitely many paths [GOO75,HOW76]. Several criteria which are based on analysis of the
program's control flow and which are stronger than branch testing but weaker than path testing have
been proposed [HOW78,MIL74,WOO80].

Recently, a number of test data adequacy criteria which are based on data flow analysis and
which "bridge the gap" between branch testing and path testing have been proposed and studied
[RAP82,RAP85,LAS83,NTA84,CLA85]. Tools based on some of them have been implemented
[FRA85a,FRA85b,GIR85,KOR85]. These criteria are based on the intuition that one should not
feel  confident  that  a  variable  has  been  assigned  the  correct  value  at  some  point  in  the  program  if  no 
test  data  causes  the  execution  of  a  path  from  the  assignment  to  a  point  where  the  variable's  value  is 
subsequently  used. 

All of these criteria suffer from the weakness that for programs with unexecutable paths, it may
be impossible for any test set to satisfy the given adequacy criterion. For example, consider a
program having a for loop in which the upper bound is always greater than or equal to the lower
bound. Such a program has unexecutable paths (those which bypass the loop body) and therefore
cannot be adequately tested using the path testing criterion. Our experience has shown that for many
programs, unexecutable paths make it impossible for any test to satisfy a given data-flow testing
criterion [FRA85a,FRA85b].

This  is  clearly  an  undesirable  situation.  One  property  which  one  would  expect  a  "good" 
adequacy  criterion  C  to  have  is  the  applicability  property:  for  every  program  P  there  exists  some  test 
which  is  C-adequate  for  P  [WEY85].  Not  only  does  the  applicability  property  fail  for  these  criteria, 
but  for  each  of  them  it  is  undecidable  for  a  given  program  whether  test  data  exists  which  adequately 
tests  the  program. 

In  this  paper  we  define  a  new  family  of  adequacy  criteria,  derived  from  the  data-flow  testing 
criteria  proposed  in  [RAP82,RAP85].  Roughly  speaking,  for  each  of  these  criteria,  a  test  is 
adequate  if  and  only  if  it  comes  "as  close  as  possible"  to  satisfying  the  corresponding  data  flow 
testing  criterion.  These  criteria  will  be  defined  precisely  and  the  relationships  between  them  will  be 
explored  in  section  3.  In  section  2  we  summarize  the  theory  of  data-flow  testing,  extending  it  to 
apply  to  programs  written  in  Pascal.  In  section  4  a  new  data  flow  testing  criterion  is  introduced  and 
its  properties  are  examined. 

2.    Data Flow Testing

A  family  of  test  data  selection  criteria,  each  based  on  the  program's  data  flow  characteristics, 
was  defined  in  [RAP82]  and  [RAP85]  for  a  very  simple  universal  programming  language  consisting 
of  assignment  statements,  conditional  and  unconditional  transfer  statements,  and  input  and  output 
statements.  These  criteria,  which  we  call  data  flow  testing,  or  DF  testing  for  short,  require  that 
certain  paths  from  a  variable's  definition  to  its  subsequent  uses  be  selected.  A  tool,  ASSET,  which 
performs  DF  testing  on  programs  written  in  such  a  language  is  described  in  [FRA85a].  We  have 
extended  DF  testing  to  apply  to  Pascal  programs  and  enhanced  ASSET  accordingly.  We  now 
summarize  the  extended  theory  of  data  flow  testing. 

We  apply  DF  testing  to  an  individual  subprogram,  i.e.,  a  main  program,  a  procedure,  or  a 
function.  While  treating  each  array  element  as  an  individual  variable  would  in  many  cases  lead  to 
the selection of better test data [FRA85b], doing so is not practical. We therefore treat each array as
a single entity. Similarly, occurrences of pointer variables are analyzed purely syntactically; no attempt
is made to identify the object to which the pointer points. Each field of a record is treated as an
individual  variable.  Any  unqualified  occurrence  of  a  record  is  treated  as  an  occurrence  of  each  field 
of  the  record.  As  a  technical  convenience  we  assume  that  the  subprogram  does  not  have  goto 
statements,  with  statements,  variant  records,  functions  having  variable  parameters,  or  procedure  or 
function  parameters.  We  also  assume  that  in  every  conditional  statement  at  least  one  variable  occurs 
in  the  boolean  expression  which  determines  the  flow  of  control. 

A  subprogram  can  be  uniquely  decomposed  into  a  set  of  disjoint  blocks  of  statements.  A  block 
is a maximal sequence of simple statements having the properties that it can only be entered through
the  first  statement  and  that  whenever  the  first  statement  is  executed,  the  remaining  statements  are 
executed  in  the  given  order.  The  subprogram  to  be  tested  is  represented  by  a  flow  graph  in  which 
the  nodes  correspond  to  the  blocks  of  the  subprogram,  and  edges  indicate  possible  flow  of  control 
between  blocks.  Figure  1  shows  the  subgraphs  corresponding  to  statements  in  the  language.  The 
subprogram's  flow  graph  is  obtained  by  merging  the  exit  node  of  each  statement  with  the  entry  node 
of the following statement. An entry node preceding the first statement of the procedure and an exit
node succeeding the last statement are added.

Figure 1. Statement forms and the classification of their variable occurrences. The subgraph
drawings are omitted; a circled S_i denotes the subgraph corresponding to S_i, and [illegible]
marks a subgraph symbol which could not be read.

assignment statement
    v := expr;
        Node i has a c-use of each variable in expr, followed by a definition of v.

procedure call
    P(x_1,...,x_n);
        Node i has c-uses of x_1,...,x_n followed by definitions of x_{i_1},...,x_{i_m}, where
        the parameters in positions i_1,...,i_m are variable parameters and the rest are value
        parameters.

input/output statements
    read(x_1,...,x_n);  read(f,x_1,...,x_n);  readln(x_1,...,x_n);  readln(f,x_1,...,x_n);
        Node i has definitions of x_1,...,x_n. If the variable f is present node i has a c-use
        of f followed by a definition of f; otherwise node i has a c-use of input followed by a
        definition of input.
    write(x_1,...,x_n);  write(f,x_1,...,x_n);  writeln(x_1,...,x_n);  writeln(f,x_1,...,x_n);
        Node i has a c-use of x_1,...,x_n.

conditional statements
    if boolean expr then S_1
        Let k be the entry node of [illegible]. Edges (i,j) and (i,k) have p-uses of each
        variable in boolean expr.
    if boolean expr then S_1 else S_2;
        Let j,k be the entry nodes of [illegible] and [illegible]. Edges (i,j) and (i,k) have
        p-uses of each variable in boolean expr.
    case boolean expr of ... end;
        Let n_1,...,n_k be the entry nodes of [illegible]. Edges (i,n_1),...,(i,n_k) have p-uses
        of each variable in boolean expr.

repetitive statements
    while boolean expr do S_1;
        Let k be the entry node of [illegible]. Edges (i,j) and (i,k) have p-uses of each
        variable in boolean expr.
    repeat S_1; S_2; ... until boolean expr;
        Let j be the entry node of [illegible] and let k be the exit node of [illegible]. Edges
        (k,j) and (k,i) have p-uses of each variable in boolean expr.
    for v := expr1 to expr2 do S_1
        Let g and h be the entry and exit nodes respectively of [illegible]. Node i has c-uses
        of all variables in expr1 followed by a definition of v. Edges (j,g) and (j,k) have
        p-uses of v and of all variables in expr2. Node h has a c-use of v followed by a
        definition of v.

Data  flow  analysis  was  originally  used  for  compiler  optimization,  and  generally  classifies  each 
variable  occurrence  as  being  either  a  definition,  an  undefinition  or  a  use  [HEC77,SCH73].  In 
addition,  we  distinguish  between  two  substantially  different  types  of  uses.  The  first  type  directly 
affects  the  computation  being  performed  or  allows  one  to  see  the  result  of  some  earlier  definition. 
We  call  such  a  use  a  computation  use  or  a  c-use.  Of  course,  a  c-use  may  indirectly  affect  the  flow  of 
control  through  the  subprogram.  In  contrast,  the  second  type  of  use  directly  affects  the  flow  of 
control  through  the  subprogram,  and  thereby  may  indirectly  affect  the  computations  performed.  We 
call  such  a  use  a  predicate  use  or  p-use.  Figure  1  shows  the  classification  of  variable  occurrences  in 
the  language's  statements. 

We are interested in tracing the flow of data between nodes, and thus define a c-use of a
variable x in node i to be a global c-use if the value of x has been assigned in some block other than
block i. Let x be a variable occurring in a subprogram. A path (i,n_1,...,n_m,j), m ≥ 0, containing no
definitions or undefinitions of x in nodes n_1,...,n_m is called a definition clear path with respect to x
[def-clear path wrt x] from node i to node j and from node i to edge (n_m,j). A node i has a global
definition of a variable x if it has a definition of x and there is a def-clear path wrt x from node i to
some node containing a global c-use or edge containing a p-use of x. Since every p-use is associated
with a potential transfer of control from one node to another, there is no need to distinguish between
p-uses and global p-uses. The subprogram's def-use graph is obtained by associating with each node
i, the sets c-use(i) and def(i) defined in Figure 2, and with each edge (i,j) the set p-use(i,j). In
addition, the entry node is considered to have a global definition of each parameter, each non-local
variable which occurs in the subprogram, and the text variable input, which may be implicitly used in
read or readln statements. The exit node has an undefinition of each local variable and a c-use of
each variable parameter.

A definition-c-use association is a triple (i,j,x) where i is a node containing a global definition of
x and j ∈ dcu(x,i). A definition-p-use association is a triple (i,(j,k),x) where i is a node containing a
global definition of x and (j,k) ∈ dpu(x,i). A simple path is one in which all nodes, except possibly
the first and last, are distinct. A loop-free path is one in which all nodes are distinct. A path
(n_1,...,n_j,n_k) is a du-path with respect to a variable x if n_1 has a global definition of x and either

i)        n_k has a c-use of x and (n_1,...,n_j,n_k) is a def-clear simple path with respect to x, or

ii)       (n_j,n_k) has a p-use of x and (n_1,...,n_j) is a def-clear loop-free path with respect to x.

V          =  the set of variables
N          =  the set of nodes
E          =  the set of edges
def(i)     =  {x ∈ V | x has a global definition in block i}
c-use(i)   =  {x ∈ V | x has a global c-use in block i}
p-use(i,j) =  {x ∈ V | x has a p-use in edge (i,j)}
dcu(x,i)   =  {j ∈ N | x ∈ c-use(j) and there is a def-clear path from i to j}
dpu(x,i)   =  {(j,k) ∈ E | x ∈ p-use(j,k) and there is a def-clear path from i to (j,k)}

Figure 2

An  association  is  a  definition-c-use  association,  a  definition-p-use  association,  or  a  du-path. 
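
The sets dcu(x,i) and dpu(x,i) of Figure 2 can be computed by a search over the def-use graph. The following Python sketch is an illustration only: the dictionary encoding of the graph, the function name `dcu_dpu`, and the four-node example are assumptions introduced here and are not taken from ASSET. It further assumes that every definition and c-use listed in the maps is global.

```python
from collections import deque

def dcu_dpu(succ, defs, c_use, p_use, x, i):
    """Return (dcu(x,i), dpu(x,i)) for a definition of x in node i.

    succ  : node -> list of successor nodes
    defs  : node -> set of variables defined (or undefined) in that node
    c_use : node -> set of variables with a global c-use in that node
    p_use : (j, k) edge -> set of variables with a p-use on that edge
    """
    dcu, dpu = set(), set()
    seen = set()
    # Worklist of edges (j, k) such that i, ..., j is def-clear wrt x.
    work = deque((i, k) for k in succ.get(i, []))
    while work:
        j, k = work.popleft()
        if (j, k) in seen:
            continue
        seen.add((j, k))
        if x in p_use.get((j, k), set()):
            dpu.add((j, k))
        if x in c_use.get(k, set()):
            dcu.add(k)
        # Continue past k only if k leaves the value of x untouched.
        if x not in defs.get(k, set()):
            work.extend((k, m) for m in succ.get(k, []))
    return dcu, dpu

# Hypothetical graph: node 1 defines x, node 2 branches on x,
# node 3 (a loop body) uses and then redefines x, node 4 uses x.
succ = {1: [2], 2: [3, 4], 3: [2], 4: []}
defs = {1: {"x"}, 3: {"x"}}
c_use = {3: {"x"}, 4: {"x"}}
p_use = {(2, 3): {"x"}, (2, 4): {"x"}}

print(dcu_dpu(succ, defs, c_use, p_use, "x", 1))
```

In this example both definitions of x reach the same uses, so the two calls with i = 1 and i = 3 return the same pair of sets.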

A path π covers a definition-c-use association (i,j,x) [respectively a definition-p-use association
(i,(j,k),x)] if it has a definition-clear subpath with respect to x from i to j [respectively, from i to
(j,k)]. π covers a du-path π' if π' is a subpath of π. A set 𝒫 of paths covers an association if some
element  of  the  set  does. 
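
The notion of covering can be checked mechanically for a single executed path. The sketch below is one possible reading of the definition, under the assumption that a path is given as a sequence of node numbers and that `defs` maps each node to the variables it defines; the function names are hypothetical.

```python
def covers_c_use(path, defs, i, j, x):
    """True if `path` has a def-clear subpath wrt x from node i to node j."""
    for a, node in enumerate(path):
        if node != i:
            continue
        for b in range(a + 1, len(path)):
            if path[b] == j:
                return True
            if x in defs.get(path[b], set()):
                break  # x is redefined before j is reached
    return False

def covers_p_use(path, defs, i, edge, x):
    """True if `path` has a def-clear subpath wrt x from node i to `edge`."""
    for a, node in enumerate(path):
        if node != i:
            continue
        for b in range(a, len(path) - 1):
            # Every node after position a, up to the tail of the edge,
            # must leave x untouched.
            if b > a and x in defs.get(path[b], set()):
                break
            if (path[b], path[b + 1]) == edge:
                return True
    return False

# Hypothetical executed path through a loop whose body (node 3) redefines x.
path = (1, 2, 3, 2, 4)
defs = {1: {"x"}, 3: {"x"}}

print(covers_c_use(path, defs, 1, 4, "x"))       # definition in 1 is killed in 3
print(covers_c_use(path, defs, 3, 4, "x"))
print(covers_p_use(path, defs, 1, (2, 3), "x"))
```

A set of paths then covers an association if any one of its members does, which amounts to applying `any` over the set.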

A  path  through  a  subprogram's  flow  graph  (which  we  shall  refer  to  in  the  sequel  as  a  path 
through  the  subprogram)  is  a  control  path  if  its  first  node  is  the  entry  node  and  its  last  node  is  the 
exit  node.  A  path  is  executable  or  feasible  if  there  exists  some  assignment  of  values  to  input 
variables,  non-local  variables,  and  parameters  which  causes  the  path  to  be  executed.  According  to 
this  definition,  the  question  of  whether  or  not  a  given  path  through  a  subprogram  is  executable  is 
independent  of  the  context  in  which  that  subprogram  is  called.  However,  it  may  depend  on  the 
effects  of  any  procedures  or  functions  which  are  called  along  the  path.  Note  that  whether  or  not  a 
particular  path  is  executable  depends  on  the  actual  subprogram,  not  just  on  its  def-use  graph.  Since 
the sets of paths to which we are applying the criteria arise from the execution of test data, we will
always assume that they are sets of executable control paths.

Roughly speaking, the family of data flow testing criteria is based on requiring that the test data
execute definition-clear paths from each node containing a global definition of a variable to specified
nodes containing global c-uses and edges containing p-uses of that variable. For each variable
definition we can demand that {all | some} definition clear paths with respect to that variable from the
definition to {all | some} of the {uses | c-uses | p-uses} reachable by some such path be executed.
The criteria are defined precisely in Figure 3.


THE DATA FLOW TESTING CRITERIA

Test T satisfies criterion C for subprogram P if for each node i and each x ∈ def(i) the set 𝒫 of paths
executed by T covers the following associations:

CRITERION                  ASSOCIATIONS REQUIRED

All-defs                   Some (i,j,x) s.t. j ∈ dcu(x,i)
                           or some (i,(j,k),x) s.t. (j,k) ∈ dpu(x,i).

All-p-uses                 All (i,(j,k),x) s.t. (j,k) ∈ dpu(x,i).

All-p-uses/some-c-uses     All (i,(j,k),x) s.t. (j,k) ∈ dpu(x,i).
                           In addition, if dpu(x,i) = ∅ then some (i,j,x) s.t. j ∈ dcu(x,i).

All-c-uses/some-p-uses     All (i,j,x) s.t. j ∈ dcu(x,i).
                           In addition, if dcu(x,i) = ∅ then some (i,(j,k),x) s.t. (j,k) ∈ dpu(x,i).

All-uses                   All (i,j,x) s.t. j ∈ dcu(x,i)
                           and all (i,(j,k),x) s.t. (j,k) ∈ dpu(x,i).

All-du-paths               All du-paths from i to j with respect to x for each j ∈ dcu(x,i)
                           and all du-paths from i to (j,k) with respect to x for each (j,k) ∈ dpu(x,i).

For comparison we also define the criteria all-nodes (respectively all-edges, all-paths) which require
that 𝒫 cover every node (respectively every edge, every path) in the flow graph.

Figure  3 
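
To make the table concrete, the following sketch checks the association-based criteria of Figure 3 for a single pair (i, x). It is an illustration under stated assumptions: `covered` is taken to be the set of uses (nodes and edges) whose associations with the definition in node i are covered by the test's paths, and the function name `satisfies` is hypothetical. All-du-paths is left out because it constrains individual paths rather than associations.

```python
def satisfies(criterion, dcu, dpu, covered):
    """Check one (node i, variable x) pair against a criterion of Figure 3.

    dcu, dpu : the sets dcu(x,i) and dpu(x,i)
    covered  : uses in dcu | dpu whose association is covered by the test
    """
    c_cov, p_cov = dcu & covered, dpu & covered
    if criterion == "all-defs":
        return bool(c_cov | p_cov)
    if criterion == "all-p-uses":
        return p_cov == dpu
    if criterion == "all-p-uses/some-c-uses":
        return p_cov == dpu and (bool(dpu) or bool(c_cov))
    if criterion == "all-c-uses/some-p-uses":
        return c_cov == dcu and (bool(dcu) or bool(p_cov))
    if criterion == "all-uses":
        return c_cov == dcu and p_cov == dpu
    raise ValueError(criterion)

# Hypothetical data: two c-uses and two p-uses, one c-use left uncovered.
dcu = {3, 4}
dpu = {(2, 3), (2, 4)}
covered = {3, (2, 3), (2, 4)}

for name in ("all-defs", "all-p-uses", "all-p-uses/some-c-uses",
             "all-c-uses/some-p-uses", "all-uses"):
    print(name, satisfies(name, dcu, dpu, covered))
```

A test satisfies a criterion for the whole subprogram when this check succeeds for every node i and every x ∈ def(i). Note that a definition with no p-uses satisfies all-p-uses vacuously even when nothing is covered, which is why all-p-uses alone does not guarantee all-defs.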

Criterion C_1 includes criterion C_2 if and only if for every subprogram, any test which satisfies
C_1 also satisfies C_2. Criterion C_1 strictly includes criterion C_2, denoted C_1 => C_2, if and only if C_1
includes C_2 and for some subprogram P there is a test which satisfies C_2 but does not satisfy C_1. The
notion of subsumption in [CLA85] is identical to our notion of inclusion.



Rapps  and  Weyuker  proved  that  for  the  simple  language  for  which  DF  testing  was  originally 
defined,  subject  to  certain  minor  syntactic  restrictions,  the  relationship  among  the  criteria  is  as  shown 
in Figure 4 [RAP82,RAP85]. Clarke et al. [CLA85] have shown the relationship of the criteria
defined  by  Laski  and  Korel  [LAS83]  and  Ntafos  [NTA84]  to  the  DF  criteria.  We  have  extended  the 
theory  of  DF  testing  in  such  a  way  that  these  relations  are  preserved.  Doing  so  required  the 
inclusion  of  definitions  of  all  non-local  variables  in  the  entry  node  of  the  procedure  and  careful 
treatment  of  implicit  uses  of  the  text  variable  input. 

3.    The  Feasible  Data  Flow  Testing  Criteria 

Given  a  subprogram  P  and  a  DF  criterion  C  it  may  be  the  case  that  no  test  data  for  P  exists 
which  satisfies  C.  This  occurs  when  none  of  the  paths  which  cover  some  association  required  by  C 
are executable. In such a case, P cannot be adequately tested according to C. In this section we
introduce a new family of criteria which are derived from the DF criteria and which circumvent this
problem, and we investigate some of their properties.

We  will  say  that  an  association  is  executable  if  there  is  some  executable  path  which  covers  it;  it 
is unexecutable otherwise. We define subsets fdcu(x,i) ⊆ dcu(x,i) and fdpu(x,i) ⊆ dpu(x,i)
consisting only of those associations which are executable. For each DF criterion C we define a new
criterion C* by selecting the required associations from fdcu(x,i) and fdpu(x,i) instead of from
dcu(x,i) and dpu(x,i). Precise definitions of these criteria are given in Figure 5. We will refer to the
criteria {(all-du-paths)*, (all-uses)*, (all-p-uses/some-c-uses)*, (all-c-uses/some-p-uses)*,
(all-p-uses)*} as feasible data flow testing criteria, or FDF criteria, for short.

The FDF criteria satisfy the applicability property: for any subprogram P and any FDF criterion
C*, there is some (possibly empty) test T which satisfies C*. However, the question of whether a
particular T satisfies C* for subprogram P is undecidable. In going from the family DF to the family
FDF, we have traded the undecidability of the existence question, "is there any test which is
C-adequate for P?" for the undecidability of the recognition problem "is a given test C*-adequate for P?"
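
The effect of restricting attention to executable associations can be sketched as follows. Because executability is undecidable in general, the sketch simply assumes that the sets fdcu(x,i) and fdpu(x,i) are supplied from outside, for instance by manual analysis; the function names and the example data are hypothetical.

```python
def all_uses(dcu, dpu, covered):
    """All-uses for one (i, x): every use in dcu | dpu must be covered."""
    return (dcu | dpu) <= covered

def all_uses_star(fdcu, fdpu, covered):
    """(all-uses)* for one (i, x): only the executable uses are required."""
    return all_uses(fdcu, fdpu, covered)

def all_defs_star(fdcu, fdpu, covered):
    """(all-defs)*: some executable use must be covered, if any exists."""
    feasible = fdcu | fdpu
    return not feasible or bool(feasible & covered)

# Hypothetical data: the c-use in node 4 is assumed to be reachable only
# along unexecutable paths, so it is excluded from fdcu.
dcu, dpu = {3, 4}, {(2, 3), (2, 4)}
fdcu, fdpu = {3}, {(2, 3), (2, 4)}
covered = {3, (2, 3), (2, 4)}

print(all_uses(dcu, dpu, covered))         # the unstarred criterion cannot be met
print(all_uses_star(fdcu, fdpu, covered))  # the starred criterion is met
print(all_defs_star(set(), set(), set()))  # the empty test can be adequate
```

The last call corresponds to the "possibly empty" test mentioned in the text: when no association for a definition is executable, nothing is required of the test for that definition.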

Observe that for any DF criterion C, C => C*. We now investigate the inclusion relations
among the FDF criteria.

THE FEASIBLE DATA FLOW TESTING CRITERIA

fdcu(x,i) = {j ∈ dcu(x,i) | the association (i,j,x) is executable}
fdpu(x,i) = {(j,k) ∈ dpu(x,i) | the association (i,(j,k),x) is executable}

Test T satisfies criterion C for subprogram P if for each node i and each x ∈ def(i) the set 𝒫 of paths
executed by T covers the following associations:

CRITERION                    REQUIRED ASSOCIATIONS

(all-defs)*                  if fdcu(x,i) ∪ fdpu(x,i) ≠ ∅ then
                             some (i,j,x) s.t. j ∈ fdcu(x,i) or some (i,(j,k),x) s.t. (j,k) ∈ fdpu(x,i).

(all-p-uses)*                all (i,(j,k),x) s.t. (j,k) ∈ fdpu(x,i).

(all-p-uses/some-c-uses)*    all (i,(j,k),x) s.t. (j,k) ∈ fdpu(x,i).
                             In addition, if fdpu(x,i) = ∅ and fdcu(x,i) ≠ ∅
                             then some (i,j,x) s.t. j ∈ fdcu(x,i).

(all-c-uses/some-p-uses)*    all (i,j,x) s.t. j ∈ fdcu(x,i).
                             In addition, if fdcu(x,i) = ∅ and fdpu(x,i) ≠ ∅
                             then some (i,(j,k),x) s.t. (j,k) ∈ fdpu(x,i).

(all-uses)*                  all (i,j,x) s.t. j ∈ fdcu(x,i)
                             and all (i,(j,k),x) s.t. (j,k) ∈ fdpu(x,i).

(all-du-paths)*              all executable du-paths with respect to x from i to j s.t. j ∈ dcu(x,i)
                             and all executable du-paths with respect to x from i to (j,k)
                             for each (j,k) ∈ dpu(x,i).

For comparison we also define the criteria (all-nodes)* [respectively (all-edges)*, (all-paths)*] which
require that 𝒫 cover each executable node [respectively each executable edge, each executable path].

Figure 5

THEOREM 1: The family of FDF criteria is partially ordered by strict inclusion as shown in Figure
6. Furthermore, FDF criterion C_1* includes FDF criterion C_2* if and only if it is explicitly shown to
do so in the figure or it follows from the transitivity of the relations.

PROOF: 

A.  Strictness  of  the  inclusions 

We first observe that if subprogram P has no unexecutable paths then a test is C-adequate for P
if and only if it is C*-adequate for P. This observation, along with the proofs of strictness of the
inclusions in Theorem 1 of [RAP85], none of which involve subprograms with unexecutable paths,
shows that all of the inclusions in Figure 6 are strict. It thus suffices to show that the inclusions in
Figure 6 hold.

[Figure 6: inclusion relations among the FDF criteria]

B.l.  (all-paths)*  =>  (all-uses)*  : 

Suppose not. Then there is a subprogram P and a set T of test data which is (all-paths)*-
adequate for P but not (all-uses)*-adequate. Let 𝒫 be the set of paths through P which T executes.
There exist a node i in P with a global definition of some variable x, a node j with a global c-use of x
or edge (j,k) with a p-use of x, and an executable definition clear path with respect to x from i to j
[respectively from i to (j,k)] which is not covered by 𝒫. This contradicts the fact that 𝒫 covers every
executable path.

The proofs that (all-paths)* => (all-du-paths)* and (all-paths)* => (all-edges)* are similar and
will be omitted.

B.2.  (all-edges)*  =>  (all-nodes)*  : 

Let T be a test which satisfies (all-edges)* for subprogram P, and let 𝒫 be the set of paths
executed by T. Let n be any executable node in P. If n is the entry node, then n has a unique
successor, m, and (n,m) is executable. So 𝒫 covers (n,m) and hence covers n. If n is not the entry
node, then since n is executable, some branch (i,n) is executable. So 𝒫 covers (i,n) and hence covers
n.

B.3. (all-uses)* => (all-p-uses/some-c-uses)*, (all-p-uses/some-c-uses)* => (all-p-uses)*,
(all-p-uses/some-c-uses)* => (all-defs)*, (all-uses)* => (all-c-uses/some-p-uses)*,
(all-c-uses/some-p-uses)* => (all-defs)* :

These  inclusions  follow  immediately  from  the  definitions  of  the  criteria  given  in  Figure  5.  For 
example,  any  set  P  of  paths  which  covers  all  of  the  associations  required  by  (all-uses)*  will  a  fortiori 
cover  all  of  the  associations  required  by  (all-p-uses/some-c-uses)*. 

We  next  show  that  those  relations  not  in  the  transitive  closure  of  the  diagram  in  Figure  6  do  not 
hold. 

C.1. (all-du-paths)* =/=> (all-p-uses)*; (all-du-paths)* =/=> (all-p-uses/some-c-uses)*;
(all-du-paths)* =/=> (all-uses)*; (all-du-paths)* =/=> (all-c-uses/some-p-uses)*;
(all-du-paths)* =/=> (all-defs)*; (all-du-paths)* =/=> (all-edges)*; (all-du-paths)* =/=> (all-nodes)* :

It suffices to show that (all-du-paths)* =/=> (all-p-uses)*, (all-du-paths)* =/=> (all-defs)*, and
(all-du-paths)* =/=> (all-nodes)*, where =/=> denotes failure of inclusion. The rest follows from the
transitivity of =>. Consider the subprogram shown in Figure 7a. Its du-paths are shown in Figure 7b.
Of these, only (1,2), (2,3,4), (4,3,4), and (4,3,5) are executable. Let T = {(X,Y)} where X is any
integer and Y < 0. Since T executes 𝒫 = {(1,2,3,4,3,4,3,5,6,8)}, T satisfies (all-du-paths)*.
However, 𝒫 does not cover the associations
(2,(5,7),y),  (2,7,x),  or  node  7,  all  of  which  are  covered  by  the  executable  path  (1,2,3,4,3,4,3,5,7), 
so  T  does  not  satisfy  (all-p-uses)*,  (all-defs)*,  or  (all-nodes)*. 

Intuitively,  (all-du-paths)*  fails  to  include  these  criteria  because  it  is  possible  for  a  subprogram 
to  have  certain  definition-use  associations  which  can  be  executed  only  by  paths  which  traverse  some 
loop one or more times. In section 4 we shall introduce a modified version of the (all-du-paths)*
criterion which includes (all-uses)* and (all-edges)*, and hence all of the other FDF criteria.

C.2. (all-p-uses)* =/=> (all-edges)*; (all-p-uses/some-c-uses)* =/=> (all-edges)*;
(all-uses)* =/=> (all-edges)*; (all-p-uses)* =/=> (all-nodes)*;
(all-p-uses/some-c-uses)* =/=> (all-nodes)*; (all-uses)* =/=> (all-nodes)* :

Consider the subprogram in Figure 8. Notice that since node 3 is unexecutable, y is always
uninitialized at node 4. We will assume that under these circumstances, edges (4,5) and (4,6) are
both executable. This would be the case, for example, in an environment in which uninitialized
variables receive arbitrary values. Since node 3 is unexecutable, the only executable definition-use
associations are (1,2,input) and (2,(2,4),x). Let T be a test which executes 𝒫 = {(1,2,4,5,7,8)}.
Then T satisfies (all-p-uses)*, (all-p-uses/some-c-uses)*, and (all-uses)*, but does not satisfy
(all-edges)* or (all-nodes)*.

Notice  that  the  subprogram  in  Figure  8  has  a  data  flow  anomaly  [OST76].  We  shall  see  below 
that  this  is  not  a  mere  coincidence,  but  that  rather,  it  is  this  particular  kind  of  anomaly  which 
prevents  the  inclusions  from  holding. 

The rest of the non-inclusions follow from the incomparability and strictness proofs in
[RAP85].    ■ 

It seems discouraging that (all-p-uses)* fails to include (all-edges)*. Data flow testing was
developed in order to "bridge the gap" between branch testing and path testing. Since many "real-
life" subprograms cannot be adequately tested using the unstarred versions of the data flow criteria,
one would hope that the FDF criteria would "bridge the gap" between (all-edges)* and (all-paths)*.
We have seen that this is not the case. We next show that for a certain class of "well-behaved"
subprograms, any test which satisfies (all-p-uses)* satisfies (all-edges)*.

DEFINITION:  We  will  say  that  a  subprogram  P  satisfies  the  No-Feasible-Undefined-P-uses  property 
(NFUP)  if  and  only  if  for  every  executable  edge  (i,j)  in  P  having  a  p-use  of  a  variable  x  there  is 
some  executable  path  from  the  start  node  to  edge  (i,j)  which  contains  a  global  definition  of  x. 

We  note  that  it  is  quite  reasonable  to  expect  subprograms  to  have  property  NFUP.    If  (i,j)  is  an 
edge  which  causes   NFUP  to   fail,  then   any  input  which  causes   (i,j)   to  be  executed  will  involve 
referencing  an  uninitialized  variable. 
THEOREM  2:  For  the  class  of  subprograms  which  satisfy  NFUP,  (all-p-uses)*  =>  (all-edges)*. 

PROOF: Let P be a subprogram satisfying NFUP, let T be a test which satisfies (all-p-uses)* for P,
let 𝒫 be the set of paths executed by T, and let (i,j) be an executable edge in P. Suppose (i,j) has a
p-use of a variable x. By hypothesis there is an executable path π from the start node to (i,j) which
includes a global definition of x. Let n be the last node in π having a global definition of x. Then
(n,(i,j),x) is an executable definition-p-use association so it is covered by 𝒫. Hence (i,j) is covered
by 𝒫.

If (i,j) has no p-uses, then the result follows by a straightforward modification of the
corresponding part of the proof of all-p-uses => all-edges [RAP85].    ■

In  [RAP85]  the  class  of  subprograms  to  which  data-flow  testing  applies  is  restricted  to  those 
subprograms  satisfying  the  No-Syntactic-Undefined-P-use  Property  (NSUP): 

For  every  p-use  of  a  variable  x  on  an  edge  (i,j)  in  P,  there  is  some  path  from  the  start  node  to 
edge  (i,j)  which  contains  a  global  definition  of  x. 

This restriction was necessary in order to ensure that all-p-uses => all-edges. NFUP is a strengthening
of NSUP which takes into account the fact that in subprograms satisfying NSUP, the only paths π
from the entry node to some p-use of x such that x has a global definition in some node in π might
be  unexecutable. 
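
Unlike NFUP, NSUP refers only to the flow graph and can therefore be checked by graph reachability. The sketch below is one way to do so; the graph encoding, the function names, and the six-node example are assumptions made for illustration. A definition in node i itself is counted as lying on a path to edge (i,j).

```python
def reachable(succ, start):
    """Nodes reachable from `start` by a path of zero or more edges."""
    seen, stack = {start}, [start]
    while stack:
        n = stack.pop()
        for m in succ.get(n, []):
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return seen

def satisfies_nsup(succ, defs, p_use, start):
    """True if every p-use of x on an edge (i,j) lies on some path from
    `start` that passes through a node defining x before reaching (i,j)."""
    from_start = reachable(succ, start)
    for (i, _j), variables in p_use.items():
        for x in variables:
            has_def = any(
                x in defs.get(d, set()) and i in reachable(succ, d)
                for d in from_start
            )
            if not has_def:
                return False
    return True

# Hypothetical graph: y is defined only in node 3 and tested at node 4.
succ = {1: [2], 2: [3, 4], 3: [4], 4: [5, 6], 5: [], 6: []}
p_use = {(2, 3): {"x"}, (2, 4): {"x"}, (4, 5): {"y"}, (4, 6): {"y"}}
defs_with_y = {1: {"x"}, 3: {"y"}}
defs_without_y = {1: {"x"}}

print(satisfies_nsup(succ, defs_with_y, p_use, 1))
print(satisfies_nsup(succ, defs_without_y, p_use, 1))
```

In the first case NSUP holds because the path through node 3 supplies a definition of y. If node 3 happened to be unexecutable, NFUP would nevertheless fail, which is exactly the gap between the two properties.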

It  is  tempting  to  restrict  the  class  of  programs  to  which  the  FDF  criteria  apply  to  those  satisfying 
NFUP.  It  is  our  feeling  however  that  while  one  can  live  with  the  undecidability  of  the  adequate  test 
recognition  problem  and  perhaps  (albeit  very  uncomfortably)  with  the  undecidability  of  the  adequate 
test  existence  problem,  one  should  at  least  be  able  to  decide  algorithmically  whether  a  given  testing 
strategy applies to a given subprogram. Since it is undecidable whether a given subprogram satisfies
NFUP,  we  refrain  from  requiring  this  property  of  subprograms  to  be  tested. 

Another possible way to force (all-p-uses)* to include (all-edges)* would be to require
subprograms  to  satisfy  the  No-Anomalies  Property  (NA): 

Every  path  from  the  start  node  to  a  use  of  a  variable  x  must  contain  a  definition  of  x. 

Osterweil and Fosdick [OST76] consider any subprogram not satisfying this property to have a
data-flow anomaly indicative of a possible subprogram error. Since NA is a purely syntactic property and
NA implies NFUP, we could restrict FDF testing to subprograms satisfying this property. We feel
that this is overly restrictive, since many perfectly good subprograms fail to satisfy NA.
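
Since NA is syntactic, a violation can be found by searching the flow graph for a use that is reached before any definition. The Python sketch below illustrates this under simplifying assumptions that are not taken from [OST76]: within a node, c-uses are taken to precede definitions, p-uses on an edge (i,j) are taken to follow the definitions in node i, and the dictionaries `defs`, `c_uses`, and `p_uses` are a hypothetical representation chosen for the example.

```python
def na_violations(succ, start, defs, c_uses, p_uses, variables):
    """List uses reachable from start along a path with no prior definition.

    succ:   node -> list of successor nodes
    defs:   node -> set of variables defined in the node
    c_uses: node -> set of variables with a c-use in the node
    p_uses: (i, j) -> set of variables with a p-use on the edge
    Returns a list of (variable, node-or-edge) pairs; empty means NA holds.
    """
    violations = []
    for x in variables:
        # Visit only nodes reachable while x is still undefined.
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            if x in c_uses.get(node, ()):
                violations.append((x, node))
            if x in defs.get(node, ()):
                continue  # x is defined here, so later uses are safe
            for nxt in succ.get(node, ()):
                if x in p_uses.get((node, nxt), ()):
                    violations.append((x, (node, nxt)))
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return violations
```

In a diamond-shaped graph where only one branch defines x and the join node uses it, the search reports the join node. This is the kind of subprogram the text calls anomalous even though it may be perfectly good.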

One way to force NA to be satisfied is to give the entry node a definition of each variable.
Another approach is to perform FDF testing in conjunction with a check for data-flow anomalies.
For any subprogram which satisfies NA and any test T which satisfies (all-p-uses)*, the user will be
assured that T satisfies (all-edges)*. In case NA does not hold, the user should explicitly check
whether (all-edges)* is satisfied and if necessary add more test data or inspect the subprogram for
references to uninitialized variables.

4.    A  Modification  of  the  All-du-paths  Criterion 

We have seen above that (all-du-paths)* fails to include (all-uses)*. This is because in some
programs the only executable def-clear paths with respect to a variable x from some definition of x to
some use of x are paths with non-simple cycles. In this section we define a modification of the
all-du-paths criterion which includes all-uses and whose starred version includes (all-uses)*.

DEFINITION: Let π = (n_1, n_2, ..., n_k) be a du-path with respect to x. Then cycle-extension(π,x)
= {def-clear paths with respect to x of the form (X_1, X_2, ..., X_k) where each X_i is a path of length
greater than or equal to one, beginning and ending with n_i}.

Informally, cycle-extension(π,x) is the set of def-clear paths with respect to x formed by
following π, taking "side-trips" which traverse cycles zero or more times. For any du-path π with
respect to x, π ∈ cycle-extension(π,x).

Our modified version of all-du-paths requires that one element of the cycle-extension of each
du-path be selected. We now define this criterion formally and investigate some of its properties.

DEFINITION: A test T which executes the set P of paths satisfies the cycle-extended-du-paths
criterion for a program P if and only if for each variable x and each du-path π with respect to x, P
covers some path π' ∈ cycle-extension(π,x).
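
The two definitions above can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the authors' tool: paths are tuples of node labels, "covers" is read as "contains as a contiguous subpath", and def-clear is read as "no node strictly inside the path defines x", with the defining nodes passed in as the hypothetical parameter `def_nodes`. All function names are invented for the example.

```python
def in_cycle_extension(path, du_path, def_nodes=frozenset()):
    """Is path of the form (X_1, ..., X_k), each X_i starting and ending at n_i?"""
    # Assumed reading of def-clear: interior nodes must not define x.
    if any(n in def_nodes for n in path[1:-1]):
        return False
    starts = {0}  # positions at which the next X_i may begin
    for node in du_path:
        ends = {e
                for s in starts if s < len(path) and path[s] == node
                for e in range(s, len(path)) if path[e] == node}
        starts = {e + 1 for e in ends}
    return len(path) in starts


def covers_some_extension(executed, du_path, def_nodes=frozenset()):
    """Does some contiguous subpath of executed lie in the cycle-extension?"""
    n = len(executed)
    return any(in_cycle_extension(executed[a:b], du_path, def_nodes)
               for a in range(n) for b in range(a + 1, n + 1))


def satisfies_cycle_extended_du_paths(executed_paths, du_paths):
    """du_paths: iterable of (du_path, def_nodes) pairs, one per du-path."""
    return all(any(covers_some_extension(p, du, d) for p in executed_paths)
               for du, d in du_paths)
```

For instance, the executed path (1,2,3,4,3,4,3,5,6) does not contain (2,3,5) as a subpath, but it does contain (2,3,4,3,4,3,5), which splits as X_1 = (2), X_2 = (3,4,3,4,3), X_3 = (5). Provided that subpath is def-clear, the du-path (2,3,5) counts as handled under the cycle-extended criterion.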

LEMMA: Let π be a def-clear path with respect to x from a node i containing a definition of x to a
node j or edge (j,k) containing a use of x. Then there is a (not necessarily unique) du-path π' such
that π ∈ cycle-extension(π',x).

PROOF: Let π = (n_1, n_2, ..., n_k) be a definition-clear path with respect to x. The following
algorithm decomposes π into the form (X_1, X_2, ..., X_h) where, for 1 ≤ j ≤ h ≤ k, X_j begins and ends
with n_{i_j} (i_j denoting the position in π at which X_j starts) and π' = (n_{i_1}, ..., n_{i_h}) is a
du-path with respect to x. Thus π ∈ cycle-extension(π',x).

begin
    {initialize cycle-traversals to the empty path.}
    for j := 1 to k do X_j := ( );
    π' := ( );
    i := 1;
    h := 0;
    while i ≤ k do
    begin
        h := h + 1;
        π' := concat(π', n_i);
        {scan rest of path looking for last occurrence of n_i}
        for j := i to k do
            if n_j = n_i then last := j;
        if (i = 1) and (last = k) then last := 1;  {special case: π begins and ends with the same node}
        {X_h is the subpath from n_i to n_last}
        for j := i to last do X_h := concat(X_h, n_j);
        {set i to position of first node in next subpath}
        i := last + 1;
    end  {while}
end;

The algorithm involves processing π from left to right, extracting maximal length subpaths
which begin and end with the same node. The concatenation of the first nodes in these subpaths is
the desired du-path. Some special care is required when π begins and ends with the same node. In
this case, X_1 is set to the subpath (n_1) and the du-path which is eventually produced is a simple cycle.
■
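
For readers who want to experiment, the decomposition can be transcribed into Python roughly as follows. This is a sketch only: it uses 0-based indices, the name `decompose` is hypothetical, and the function assumes its input is already def-clear, since it inspects only the node sequence.

```python
def decompose(path):
    """Split path into segments X_1..X_h, each starting and ending at one node.

    Returns (du_path, segments), where du_path lists the first node of
    each segment and the concatenation of the segments equals path.
    """
    k = len(path)
    du_path = []
    segments = []
    i = 0
    while i < k:
        node = path[i]
        du_path.append(node)
        # Last position at or after i holding the same node.
        last = max(j for j in range(i, k) if path[j] == node)
        # Special case: the whole path starts and ends at the same node,
        # so the first segment is kept trivial and a simple cycle results.
        if i == 0 and last == k - 1:
            last = 0
        segments.append(list(path[i:last + 1]))
        i = last + 1
    return du_path, segments
```

Applied to the node sequence [2, 3, 4, 3, 4, 3, 5], the function yields the du-path [2, 3, 5] with segments [2], [3, 4, 3, 4, 3], and [5]. Applied to [1, 2, 3, 1], the special case keeps the first segment as [1] and returns the simple cycle [1, 2, 3, 1].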
THEOREM  3:  all-du-paths  =>  cycle-extended-du-paths  =>  all-uses 

PROOF: The first inclusion follows immediately from the fact that every du-path belongs to its own
cycle-extension. Suppose cycle-extended-du-paths does not include all-uses. Then there is a program P,
a test T executing the set of paths P through P which satisfies cycle-extended-du-paths, and some
definition-use association which is not covered by P. Assume that this association is a definition-c-use
association, (i,j,x). The proof for a definition-p-use association is nearly identical. Let π be a
def-clear path with respect to x from i to j. By the lemma, there is a du-path π' with respect to x such
that π ∈ cycle-extension(π',x). Since T satisfies cycle-extended-du-paths, there is a path π'' ∈
cycle-extension(π',x) such that P covers π''. Since π'' is a def-clear path with respect to x from i to j,
P covers the association (i,j,x). Contradiction.

It  remains  to  show  that  the  inclusions  are  strict.  For  any  program  P  with  no  loops,  a  test 
satisfies  all-du-paths  for  P  if  and  only  if  it  satisfies  cycle-extended-du-paths  for  P.  This  fact,  along 
with  the  proof  in  [RAP85]  that  all-du-paths  strictly  includes  all-uses  (which  involved  a  loop-free 
program)  shows  that  cycle-extended-du-paths  strictly  includes  all-uses. 

To  see  that  all-du-paths  strictly  includes  cycle-extended-du-paths,  consider  the  program  in 
Figure  9.  The  test  {2}  which  executes  P  =  {(1,2,3,4,3,4,3,5,6)}  satisfies  cycle-extended-du-paths  but 
does  not  cover  the  du-path  (2,3,5).    ■ 
We next investigate the criterion (cycle-extended-du-paths)*.

DEFINITION: A test T which executes the set P of paths satisfies the (cycle-extended-du-paths)*
criterion for a program P if and only if for each variable x and each du-path π with respect to x for
which cycle-extension(π,x) contains at least one executable path, P covers some path π' ∈
cycle-extension(π,x).

THEOREM 4: (all-paths)* => (cycle-extended-du-paths)* => (all-uses)*

PROOF: It is obvious that (all-paths)* includes (cycle-extended-du-paths)*. The proof that
(cycle-extended-du-paths)* => (all-uses)* is similar to the unstarred case. Suppose
(cycle-extended-du-paths)* does not include (all-uses)*. Then for some subprogram P and set P of paths
through P executed by a test which satisfies (cycle-extended-du-paths)* there is an executable
definition-use association which is not covered by P. Assume that this association is a definition-c-use
association, (i,j,x). The proof for a definition-p-use association is nearly identical. Since (i,j,x) is
executable, there is an executable definition-clear path π with respect to x from i to j. By the lemma,
π ∈ cycle-extension(π',x) for some du-path π' with respect to x from i to j. Since the subset of
cycle-extension(π',x) consisting of executable paths is non-empty (it contains π), there is some
executable path π'' ∈ cycle-extension(π',x) which is covered by P. π'' is a definition-clear path with
respect to x from i to j, so P covers the association (i,j,x), a contradiction.

The  strictness  of  the  inclusion  follows  from  the  proof  of  the  strictness  of  cycle-extended-du- 
paths  =>  all-uses  since  the  program  in  Figure  9  has  no  unexecutable  paths.    ■ 

5.    Conclusions 

We  have  introduced  a  new  family  of  path  selection  criteria  derived  from  the  data  flow  testing 
criteria  and  explored  the  relationships  among  them.  These  criteria,  the  feasible  data  flow  testing 
criteria,  are  obtained  from  the  corresponding  data  flow  testing  criteria  by  eliminating  unexecutable 
associations  from  consideration. 

For  a  large  class  of  "well-behaved"  programs,  the  FDF  criteria  (all-p-uses)*,  (all-p-uses/some- 
c-uses)*,  and  (all-uses)*  "bridge  the  gap"  between  (all-branches)*  and  (all-paths)*  in  the  same  way 
that the corresponding DF criteria do. For certain programs with anomalies, however, there are tests
which satisfy (all-p-uses)* without satisfying (all-edges)*.

Although (all-paths) => (all-du-paths) => (all-uses), (all-du-paths)* does not even include
(all-nodes)*. We have defined a new criterion, (cycle-extended-du-paths), such that (all-paths) =>
(cycle-extended-du-paths) => (all-uses) and (all-paths)* => (cycle-extended-du-paths)* => (all-uses)*.

The  advantage  of  the  FDF  criteria  over  the  DF  criteria  is  that  they  satisfy  the  applicability 
property:  for  every  subprogram  P  and  every  FDF  criterion  C  there  is  some  set  of  paths  which  is  C- 
adequate  for  P.  The  DF  criteria  do  not  satisfy  this  property.  The  disadvantage  of  the  FDF  criteria  is 
that it is undecidable whether a particular set of paths is C-adequate for P. Thus, in deciding
whether to use the DF criteria or the FDF criteria, one is faced with a trade-off between applicability
and  automatability. 

Although  it  is  in  general  undecidable  whether  a  given  association  is  executable,  it  is  often  easy 
for  a  person  looking  at  a  subprogram  to  determine  whether  or  not  a  particular  association  is 
executable.  Sometimes  this  requires  very  little  semantic  information.  For  example,  any  program 
with  a  for  loop  in  which  the  upper  bound  is  always  greater  than  or  equal  to  the  lower  bound  has  an 
unexecutable  definition-p-use  association.  In  other  cases,  determining  whether  a  given  association  is 
executable  seems  to  require  "high-level"  understanding  of  how  the  subprogram  and  other 
subprograms  which  it  calls  operate. 
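
The for-loop case can be illustrated with a small Python sketch. It is an illustration only, with hypothetical names: the loop is written out with an explicit test so that each evaluation of the test can record which definition of the loop variable reached it ("init" or "increment") and which branch was taken ("enter" or "exit").

```python
def covered_loop_associations(lower, upper):
    """Record (definition site, branch) pairs exercised by a counting loop."""
    covered = set()
    i = lower                 # definition of i at loop initialization
    def_site = "init"
    while True:
        if i <= upper:        # p-use of i on the edge into the loop body
            covered.add((def_site, "enter"))
        else:                 # p-use of i on the loop exit edge
            covered.add((def_site, "exit"))
            break
        i += 1                # definition of i at the increment
        def_site = "increment"
    return covered
```

Whenever `upper >= lower`, the pair ("init", "exit") never appears in the result, because the first test always succeeds. If the bounds of the loop are known to satisfy that relation on every input, the association between the initializing definition and the p-use on the exit edge is unexecutable, which is the situation described above.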

We  plan  to  enhance  ASSET,  our  data  flow  testing  tool,  by  adding  heuristics  which  attempt  to 
determine  which  of  the  required  associations  are  unexecutable.  When  the  heuristics  cannot  decide 
whether  or  not  a  particular  association  is  executable,  the  person  using  the  tool  will  have  to  intervene. 
We  hope  that  this  approach  will  prove  to  be  a  practical  way  to  preserve  the  applicability  property 
enjoyed by the FDF criteria while sacrificing automatability to as small an extent as possible.